1 Introduction



It is natural to ask whether the Shannon entropy of an n-dimensional random vector with density p, defined as

$$H(p) = -\int p(x) \log p(x)\, dx,$$

represents the only possible measure of uncertainty. For example, Renyi [17] introduces axioms on how we would expect such a measure to behave, and shows that these axioms are satisfied by a more general definition, as follows:



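Definition 1.1 Given a probability density p valued on $\mathbb{R}^n$, for $q \neq 1$ define the q-Renyi entropy to be:

$$H_q(p) = \frac{1}{1-q} \log \int p(x)^q \, dx.$$

Note that by L'Hopital's rule, since $\frac{d}{dq} a^q = a^q \log a$,

$$\lim_{q \to 1} H_q(p) = \lim_{q \to 1} \frac{-\int p(x)^q \log p(x)\, dx}{\int p(x)^q\, dx} = H(p). \qquad (1)$$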
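The limit in Equation (1) can be illustrated numerically. The following is a minimal sketch, assuming a one-dimensional Gaussian density and a plain midpoint rule; the helper names (`integrate`, `renyi_entropy`, `shannon_entropy`) are illustrative and not part of the paper.

```python
import math

def gaussian_pdf(x, var):
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def integrate(f, lo, hi, steps=20_000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def renyi_entropy(pdf, q, lo, hi):
    """H_q(p) = log(int p^q dx) / (1 - q), for q != 1."""
    return math.log(integrate(lambda x: pdf(x) ** q, lo, hi)) / (1.0 - q)

def shannon_entropy(pdf, lo, hi):
    def integrand(x):
        p = pdf(x)
        return -p * math.log(p) if p > 0.0 else 0.0
    return integrate(integrand, lo, hi)

var = 2.0
pdf = lambda x: gaussian_pdf(x, var)
lo, hi = -40.0, 40.0          # truncation of the real line (assumption)
h_shannon = shannon_entropy(pdf, lo, hi)
# Distance between H_q and H as q approaches 1 from above.
gaps = [abs(renyi_entropy(pdf, q, lo, hi) - h_shannon)
        for q in (1.5, 1.1, 1.01, 1.001)]
```

For a Gaussian the gap equals $|\log q/(2(q-1)) - 1/2|$, so the entries of `gaps` should shrink towards zero as q approaches 1.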

As Gnedenko and Korolev [12] remark, under a variety of natural conditions
the distributions which maximise Shannon entropy are well-known ones, with 
interesting properties. This paper gives parallels to some of these properties 
for the Renyi maximisers. 



1. Under a covariance constraint Shannon entropy is maximised by the Gaussian distribution. In Proposition 1.3 we review the fact that under a covariance constraint Renyi entropy is maximised by Student distributions.

2. The Gaussians have the appealing property of stability (that is, given $Z_1$ and $Z_2$ Gaussians, $Z_1 + Z_2$ is also Gaussian). In Definition 2.2 we introduce the ★-convolution, which generalizes the addition operation. In Lemma 2.3 we extend the stability property by showing that if $R_1$ and $R_2$ are Renyi maximisers then so is $R_1 \star R_2$.

3. The Entropy Power Inequality (see Equation (7) below) shows that the Gaussian represents the extreme case for how much entropy can change on addition. Theorem 2.4 gives the equivalent of an Entropy Power Inequality, with the Renyi maximisers playing an extremal role.

4. The Gaussian density satisfies the heat equation, which leads to a representation of Shannon entropy as an integral of Fisher Informations (known as the de Bruijn identity). In Theorem 3.1 we show that the Renyi densities satisfy a generalization of the heat equation, and deduce what quantity must replace the Fisher information in general.

First, as in Costa, Hero and Vignat [5], we identify the Renyi maximising densities, which are Student-t and Student-r distributions, and review some of their properties which we will use later in the paper.

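Definition 1.2 For $n/(n+2) < q$ and $q \neq 1$, define the n-dimensional probability density $g_{q,C}$ as

$$g_{q,C}(x) = A_q \left(1 - (q-1)\beta\, x^T C^{-1} x\right)_+^{\frac{1}{q-1}} \qquad (2)$$

with

$$\beta = \beta_q = \frac{1}{2q - n(1-q)}$$

and normalization constants

$$A_q = \begin{cases} \dfrac{\Gamma\left(\frac{1}{1-q}\right) (\beta(1-q))^{n/2}}{\Gamma\left(\frac{1}{1-q} - \frac{n}{2}\right) \pi^{n/2} |C|^{1/2}} & \text{if } \frac{n}{n+2} < q < 1, \\[2ex] \dfrac{\Gamma\left(\frac{q}{q-1} + \frac{n}{2}\right) (\beta(q-1))^{n/2}}{\Gamma\left(\frac{q}{q-1}\right) \pi^{n/2} |C|^{1/2}} & \text{if } q > 1. \end{cases}$$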
Here $x_+ = \max(x, 0)$ denotes the positive part. We write $R_{q,C}$ for a random variable with density $g_{q,C}$, which has mean 0 and covariance C.

Notice that if we write $\Omega_{q,C}$ for the support of $g_{q,C}$, then for $q > 1$, $\Omega_{q,C} = \{x : x^T C^{-1} x \le 2q/(q-1) + n\}$, and for $q < 1$, $\Omega_{q,C} = \mathbb{R}^n$.

Note further that since $\lim_{q \to 1} \Gamma(1/(1-q))(1-q)^{n/2}/\Gamma(1/(1-q) - n/2) = 1$ and $\lim_{q \to 1} (1 - (q-1)\beta\, x^T C^{-1} x)^{1/(q-1)} = \exp(-x^T C^{-1} x/2)$, the limit $\lim_{q \to 1} g_{q,C}(x) = g_{1,C}(x) = ((2\pi)^n |C|)^{-1/2} \exp(-x^T C^{-1} x/2)$, the Gaussian density. Throughout this paper, we write $Z_C$ for a $N(0, C)$ random variable.
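As a sanity check on Definition 1.2, the sketch below integrates the one-dimensional density numerically and confirms that it has unit mass and variance c. It assumes the two-branch normalization constant ($q < 1$ and $q > 1$) in the Gamma-function form used in Appendix A.2; the function names and the tail truncation for $q < 1$ are illustrative choices.

```python
import math

def beta(q, n):
    """beta_q = 1 / (2q - n(1 - q)) from Definition 1.2."""
    return 1.0 / (2.0 * q - n * (1.0 - q))

def log_norm_const(q, n, log_det_c):
    """log A_q, using the Gamma-function form for q < 1 and for q > 1."""
    b = beta(q, n)
    if q < 1.0:
        a = 1.0 / (1.0 - q)
        log_gamma_ratio = math.lgamma(a) - math.lgamma(a - n / 2.0)
    else:
        a = q / (q - 1.0)
        log_gamma_ratio = math.lgamma(a + n / 2.0) - math.lgamma(a)
    return (log_gamma_ratio + 0.5 * n * math.log(b * abs(q - 1.0))
            - 0.5 * n * math.log(math.pi) - 0.5 * log_det_c)

def g(x, q, c):
    """One-dimensional density g_{q,c}(x) with variance c."""
    base = 1.0 - (q - 1.0) * beta(q, 1) * x * x / c
    if base <= 0.0:            # positive part: outside the support when q > 1
        return 0.0
    return math.exp(log_norm_const(q, 1, math.log(c))
                    + math.log(base) / (q - 1.0))

def integrate(f, lo, hi, steps=100_000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def mass_and_variance(q, c):
    if q > 1.0:
        r = math.sqrt(c * (2.0 * q / (q - 1.0) + 1.0))   # support radius, n = 1
    else:
        r = 60.0 * math.sqrt(c)                           # tail truncation (assumption)
    mass = integrate(lambda x: g(x, q, c), -r, r)
    var = integrate(lambda x: x * x * g(x, q, c), -r, r)
    return mass, var

results = {(q, c): mass_and_variance(q, c) for q, c in [(2.0, 1.0), (0.8, 1.5)]}
```

For $q = 2$ the density is a parabola supported on $|x| \le \sqrt{5c}$, while for $q = 0.8$ it is a rescaled Student-t density with 9 degrees of freedom; in both cases the computed variance should match c.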

We now state the maximum entropy property, as follows.

Proposition 1.3 Given any $q > n/(n+2)$, and positive definite symmetric matrix C, among all probability densities f with mean 0 and $\int f(x)\, x x^T dx = C$, the Renyi entropy is uniquely maximised by $g_{q,C}$, that is

$$H_q(f) \le H_q(g_{q,C}),$$

with equality if and only if $f = g_{q,C}$ almost everywhere.

Proof See Section A.1. □

Throughout this paper, we write $\chi_m$ for a random variable with density

$$f_m(x) = \frac{2^{1-m/2}}{\Gamma(m/2)}\, x^{m-1} \exp(-x^2/2), \qquad x > 0. \qquad (3)$$

(Strictly speaking, this is only a $\chi$ random variable when the parameter m is an integer, but it is simpler to adopt the convention of allowing non-integer m than to refer to the square root of a $\Gamma(m/2)$ random variable with scale factor 2).

We briefly review stochastic representations of the Renyi maximisers, which we will use throughout the paper. For the sake of completeness, we present proofs of these results in Section A.2. Part 1. of Proposition 1.4 follows for example from p. 393 of Eaton [10], Part 2. of Proposition 1.4 is stated in Dunnett [9], and Part 3. of this proposition is a multivariate version of a result stated as long ago as 1915 by Fisher [11].

Proposition 1.4 Writing $R_{q,C}$ for an n-dimensional q-Renyi maximiser with mean 0 and covariance C, and writing $Z_C$ for a $N(0, C)$:

1. Student-r. For any $q > 1$, writing $m = n + 2q/(q-1)$,

$$R_{q,C}\, U \sim Z_{mC}, \qquad (4)$$

where $U \sim \chi_m$ (independent of $R_{q,C}$).

2. Student-t. For any $n/(n+2) < q < 1$, writing $m = 2/(1-q) - n > 2$,

$$R_{q,C} \sim Z_{(m-2)C}/U, \qquad (5)$$

where $U \sim \chi_m$ (independent of Z).

3. Duality. Given matrix D, define the map

$$\Theta_D(x) = \frac{x}{\sqrt{1 + x^T D^{-1} x}}.$$

For $q < 1$, writing $m = 2/(1-q) - n$, if $R_{q,C}$ is a Renyi maximiser, then $\Theta_{C(m-2)}(R_{q,C}) \sim R_{p,C^*}$, where $1/(p-1) = 1/(1-q) - n/2 - 1$ (so $q < 1$ implies that $p > 1$) and $C^* = C((m-2)/(m+n))$.

Proof See Section A.2. □
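The Student-t representation (5) and the duality map can be explored by simulation. The sketch below is a one-dimensional Monte Carlo illustration, assuming the map $\Theta_D(x) = x/\sqrt{1 + x^2/D}$ and drawing $\chi_m$ as the square root of a Gamma variable with shape $m/2$ and scale 2; the seed, sample size and function names are arbitrary choices.

```python
import math
import random

def chi(rng, m):
    """Sample chi_m as the square root of a Gamma(m/2, scale 2) variable."""
    return math.sqrt(rng.gammavariate(m / 2.0, 2.0))

def sample_student_t(rng, q, c):
    """1-D draw of R_{q,c} for 1/3 < q < 1 via R = Z_{(m-2)c} / U."""
    m = 2.0 / (1.0 - q) - 1.0                 # n = 1
    z = rng.gauss(0.0, math.sqrt((m - 2.0) * c))
    return z / chi(rng, m)

def theta(x, d):
    """Duality map Theta_d(x) = x / sqrt(1 + x^2 / d) in one dimension."""
    return x / math.sqrt(1.0 + x * x / d)

def second_moment(xs):
    return sum(x * x for x in xs) / len(xs)

rng = random.Random(12345)
q, c, n = 0.8, 1.0, 1
m = 2.0 / (1.0 - q) - n
draws = [sample_student_t(rng, q, c) for _ in range(200_000)]
duals = [theta(x, (m - 2.0) * c) for x in draws]

var_r = second_moment(draws)                  # should be close to c
var_dual = second_moment(duals)               # should be close to c (m-2)/(m+n)
target_dual = c * (m - 2.0) / (m + n)
```

With these parameters $m = 9$, so the dual variable is bounded by $\sqrt{7}$ and its variance should be close to $0.7$, as predicted by $C^* = C(m-2)/(m+n)$.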

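Stochastic representations (4) and (5) can be used to compute the covariance and entropy of $R_{q,C}$. For example, for $q < 1$, since $U \sim \chi_m$, we have $E[1/U^2] = 1/(m-2)$, so that $\mathrm{Cov}(R_{q,C}) = E[Z_{(m-2)C} Z_{(m-2)C}^T]\, E[1/U^2] = (m-2)C \cdot \frac{1}{m-2} = C$, as claimed.

Similarly for $q < 1$, the Shannon entropy $H_1(R_{q,C})$ is given by (writing $m = 2/(1-q) - n$)

$$\begin{aligned} -E \log g_{q,C}(R_{q,C}) &= -\log A_q + \frac{m+n}{2}\, E \log\left(1 + \frac{Z_{(m-2)C}^T C^{-1} Z_{(m-2)C}}{(m-2)U^2}\right) \\ &= -\log A_q + \frac{m+n}{2}\, E \log\left(1 + \frac{N^T N}{U^2}\right) \\ &= -\log A_q + \frac{m+n}{2}\, E\left(\log \chi_{m+n}^2 - \log \chi_m^2\right), \end{aligned}$$

where $N \sim N(0, I)$ is an n-dimensional standard Gaussian vector, and since $E \log \chi_m^2 = \log 2 + \psi(m/2)$, where $\psi(\cdot)$ is the digamma function, we obtain

$$H_1(R_{q,C}) = -\log A_q + \frac{m+n}{2}\left(\psi\left(\frac{m+n}{2}\right) - \psi\left(\frac{m}{2}\right)\right). \qquad (6)$$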
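The entropy formula (6) can be compared against direct numerical integration. The sketch below does this in one dimension for $q < 1$, assuming formula (6) in the form $-\log A_q + \frac{m+n}{2}(\psi(\frac{m+n}{2}) - \psi(\frac{m}{2}))$; since the Python standard library has no digamma function, it is approximated here by a central difference of `math.lgamma`, which is an implementation shortcut rather than anything taken from the paper.

```python
import math

def digamma(x, h=1e-5):
    """Approximate digamma as the central difference of log-gamma."""
    return (math.lgamma(x + h) - math.lgamma(x - h)) / (2.0 * h)

def log_a(q, c):
    """log A_q for n = 1 and 1/3 < q < 1."""
    a = 1.0 / (1.0 - q)
    beta = 1.0 / (2.0 * q - (1.0 - q))
    return (math.lgamma(a) - math.lgamma(a - 0.5)
            + 0.5 * math.log(beta * (1.0 - q) / (math.pi * c)))

def log_g(x, q, c):
    """log g_{q,c}(x) for n = 1 and q < 1 (support is the whole real line)."""
    beta = 1.0 / (2.0 * q - (1.0 - q))
    return log_a(q, c) - math.log1p((1.0 - q) * beta * x * x / c) / (1.0 - q)

def entropy_numeric(q, c, steps=100_000):
    """Midpoint-rule estimate of -int g log g over a truncated range."""
    half_width = 60.0 * math.sqrt(c)          # tail truncation (assumption)
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        lg = log_g(-half_width + (i + 0.5) * h, q, c)
        total -= math.exp(lg) * lg
    return total * h

def entropy_formula(q, c):
    """Right-hand side of (6) with n = 1: (m + n)/2 = 1/(1-q), m/2 = 1/(1-q) - 1/2."""
    a = 1.0 / (1.0 - q)
    return -log_a(q, c) + a * (digamma(a) - digamma(a - 0.5))

pairs = [(entropy_numeric(0.8, c), entropy_formula(0.8, c)) for c in (1.0, 3.0)]
```

For $q = 0.8$ and $c = 1$ both values should be slightly below the Gaussian entropy $\frac{1}{2}\log(2\pi e)$, since the Gaussian maximises Shannon entropy at fixed variance.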



Remark 1.5 Indeed, the theory of such stochastic representations can be 
generalized from the setting of 110 and to multivariate maximizers with 
different powers. That is, given a positive sequence $(p_1, \ldots, p_n)$, the solution to the problem

$$\max H_q(X) \quad \text{such that } E|X_i|^{p_i} = K_i$$

is a random vector X with density given by

$$f(x) \propto \left(1 + \sum_{i=1}^n \beta_i |x_i|^{p_i}\right)_+^{\frac{1}{q-1}},$$

where it can be shown that the $\beta_i$ all have the same sign as $1 - q$. Moreover, if X is such a maximizer with $q > 1$, then for $k = 1, \ldots, n$ random variables $Z_k = U^{2/p_k} X_k$ are independently power-exponential distributed with marginal densities

$$f(z_k) = \frac{p_k (-\sigma_k)^{1/p_k}}{2\Gamma(1/p_k)} \exp\left(\sigma_k |z_k|^{p_k}\right), \qquad \sigma_k < 0,$$

when U is $\chi$-distributed with $m = 2/(q-1) + 2 + \sum_{i=1}^n 2/p_i$ degrees of freedom and independent of X.




2 ★-convolution and relative entropy 

In this section, we introduce a new operation, which we refer to as the ★-convolution. In Lemma 2.3 we show that this ★-convolution preserves the class of Renyi entropy maximisers, and in Theorem 2.4 show that it satisfies a version of the entropy power inequality.

We will say that a distribution is q-Renyi if it maximises the q-Renyi entropy. For the sake of simplicity, we write $D(X\|Y) = D_1(f_X\|f_Y)$ for the relative entropy between the two densities $f_X$ and $f_Y$ of random variables X and Y. We define a new distance measure:

Definition 2.1 Given an n-dimensional random vector T with mean 0 and covariance C, we define its distance from an n-dimensional q-Renyi maximiser $R_{q,C}$ (for $q > 1$) to be

$$d(T | R_{q,C}) = D(TU \| Z),$$

where U is a $\chi_m$ random variable (with $m = n + 2q/(q-1)$ degrees of freedom) independent of T, and $Z \sim N(0, mC)$.

Note that d inherits positive definiteness from D, that is $d(T | R_{q,C}) \ge 0$, with equality if and only if $T \sim R_{q,C}$. Note further that Equation (13) below implies that

$$d(T | R_{q,C}) = D(TU \| R_{q,C} U) \le D(T \| R_{q,C}).$$

Motivated by Proposition 1.4 we make the following definition:

Definition 2.2 For fixed $q > 1$, given two n-dimensional random vectors S, T, with covariance matrices $C_S$ and $C_T$, define the $\star_q$-convolution (or just ★-convolution) of S and T to be the n-dimensional random vector

$$S \star T = \frac{U^{(S)} S + U^{(T)} T}{\sqrt{V^2 + (U^{(S)} S + U^{(T)} T)^T \left(m(C_S + C_T)\right)^{-1} (U^{(S)} S + U^{(T)} T)}},$$

where $U^{(S)}$, $U^{(T)}$ and V are independent $\chi$ random variables (independent of S and T), with $U^{(S)}$ and $U^{(T)}$ having $m = n + 2q/(q-1)$ degrees of freedom and V having $2q/(q-1)$ degrees of freedom.

Again, notice that as $q \to 1$, $U^{(\cdot)}/\sqrt{2q/(q-1)} \to 1$ and $V/\sqrt{2q/(q-1)} \to 1$ by the Law of Large Numbers, so $S \star T \to S + T$.

Lemma 2.3 For $q > 1$, if S and T are q-Renyi entropy maximisers with covariances $C_S$ and $C_T$ then $S \star T$ is also a q-Renyi entropy maximiser, with covariance $C_S + C_T$.

Proof By Proposition 1.4.1, writing $m = n + 2q/(q-1)$, we know that $U^{(S)} S$ and $U^{(T)} T$ are $N(0, mC_S)$ and $N(0, mC_T)$ respectively. We define $\tilde{q}$ by $1/(1-\tilde{q}) = 1 + 1/(q-1) + n/2$, and write $\tilde{m} = 2/(1-\tilde{q}) - n = 2q/(q-1) = m - n$. Then the random variable $W = \sqrt{(\tilde{m}-2)/m}\,(U^{(S)} S + U^{(T)} T)$ is $N(0, (\tilde{m}-2)C)$, where $C = C_S + C_T$.

Then (by Proposition 1.4.2) since V has $\tilde{m}$ degrees of freedom, $W/V$ is $\tilde{q}$-Renyi, with covariance C. Finally (by Proposition 1.4.3), $\Theta_{(\tilde{m}-2)C}(W/V)$ is $q^*$-Renyi, where $1/(q^*-1) = 1/(1-\tilde{q}) - n/2 - 1 = 1/(q-1)$, so in fact it is q-Renyi with covariance $C(\tilde{m}-2)/(\tilde{m}+n)$. Hence, $S \star T = \sqrt{m/(\tilde{m}-2)}\; \Theta_{(\tilde{m}-2)C}(W/V)$ is q-Renyi with covariance $Cm/(\tilde{m}+n) = C$, and the result follows. □
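Lemma 2.3 can be illustrated by simulation in one dimension. The sketch below assumes that the ★-convolution has the form $X/\sqrt{V^2 + X^2/(m(C_S + C_T))}$ with $X = U^{(S)} S + U^{(T)} T$, which is the form consistent with the proof of Lemma 2.3 and with Proposition A.5, and it samples Student-r variables through the representation $R = \sqrt{mc}\, N/\sqrt{N^2 + \chi^2_{m-1}}$ discussed in Section A.2. Seeds, sample sizes and function names are arbitrary.

```python
import math
import random

def chi(rng, m):
    """Sample chi_m as the square root of a Gamma(m/2, scale 2) variable."""
    return math.sqrt(rng.gammavariate(m / 2.0, 2.0))

def sample_student_r(rng, q, c):
    """1-D draw of R_{q,c} for q > 1: sqrt(m c) N / sqrt(N^2 + chi^2_{m-1})."""
    m = 1.0 + 2.0 * q / (q - 1.0)
    z = rng.gauss(0.0, 1.0)
    w = chi(rng, m - 1.0)
    return math.sqrt(m * c) * z / math.sqrt(z * z + w * w)

def star(rng, s, t, q, c_s, c_t):
    """Assumed one-dimensional form of the star-convolution of s and t."""
    m = 1.0 + 2.0 * q / (q - 1.0)
    x = chi(rng, m) * s + chi(rng, m) * t     # U^(S) S + U^(T) T
    v = chi(rng, m - 1.0)                     # 2q/(q-1) = m - n degrees of freedom
    return x / math.sqrt(v * v + x * x / (m * (c_s + c_t)))

rng = random.Random(2024)
q, c_s, c_t = 2.0, 1.0, 2.0
out = [star(rng, sample_student_r(rng, q, c_s), sample_student_r(rng, q, c_t),
            q, c_s, c_t) for _ in range(100_000)]

m2 = sum(x ** 2 for x in out) / len(out)      # should be close to c_s + c_t
m4 = sum(x ** 4 for x in out) / len(out)
kurtosis = m4 / m2 ** 2                        # 15/7 for a q = 2 maximiser with n = 1
```

Matching the variance alone would not distinguish a q-Renyi law from other distributions, so the sketch also computes the kurtosis, which for $q = 2$, $n = 1$ equals $15/7$ for the maximiser (compared with 3 for a Gaussian).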



We now give a new (★-convolution) version of the classical Entropy Power Inequality, which was first stated by Shannon as Theorem 15 of [18], with a 'proof' sketched in Appendix 6. More rigorous proofs appeared in Blachman [3] and later in Dembo, Cover and Thomas [8]. The result gives that for independent n-dimensional random vectors X and Y,

$$\exp(2H(X+Y)/n) \ge \exp(2H(X)/n) + \exp(2H(Y)/n), \qquad (7)$$

with equality if and only if X and Y are Gaussian with proportional covariance matrices.

Writing $C_X$ for the covariance matrix of X, we know that $D(X \| Z_{C_X}) = (n \log(2\pi e) + \log|C_X|)/2 - H(X)$, so that the Entropy Power Inequality (7) is equivalent to

$$|C_X + C_Y|^{1/n} \exp(-2D(X+Y \| Z_{C_X + C_Y})/n) \ge |C_X|^{1/n} \exp(-2D(X \| Z_{C_X})/n) + |C_Y|^{1/n} \exp(-2D(Y \| Z_{C_Y})/n). \qquad (8)$$

We give an equivalent of Equation (8), with the ★-convolution replacing the operation of addition.

Theorem 2.4 Given $q > 1$, for independent n-dimensional random vectors S, T with mean 0 and covariances $C_S$, $C_T$,

$$|C_S + C_T|^{1/n} \exp(-2d(S \star T | R_{q,C_S+C_T})/n) \ge |C_S|^{1/n} \exp(-2d(S | R_{q,C_S})/n) + |C_T|^{1/n} \exp(-2d(T | R_{q,C_T})/n),$$

with equality if and only if S and T are q-Renyi with proportional covariance matrices.

Proof By Proposition A.5 below we know that for $U^{(S)}$, $U^{(T)}$, V, W all independent and $\chi$-distributed, where $U^{(S)}$, $U^{(T)}$, W have $m = n + 2q/(q-1)$ degrees of freedom, and V has $2q/(q-1)$ degrees of freedom:

$$\begin{aligned} d(S \star T | R_{q,C_S+C_T}) &= D((S \star T) W \| Z_{m(C_S+C_T)}) \\ &= D\left( \frac{(U^{(S)} S + U^{(T)} T)\, W}{\sqrt{(U^{(S)} S + U^{(T)} T)^T (m(C_S+C_T))^{-1} (U^{(S)} S + U^{(T)} T) + V^2}} \,\Bigg\|\, Z_{m(C_S+C_T)} \right) \\ &\le D(U^{(S)} S + U^{(T)} T \| Z_{m(C_S+C_T)}). \qquad (9) \end{aligned}$$

We can combine Equations (8) and (9) to obtain that

$$\begin{aligned} |mC_S + mC_T|^{1/n} \exp(-2d(S \star T | R_{q,C_S+C_T})/n) &\ge |mC_S + mC_T|^{1/n} \exp(-2D(U^{(S)} S + U^{(T)} T \| Z_{m(C_S+C_T)})/n) \\ &\ge |mC_S|^{1/n} \exp(-2D(U^{(S)} S \| Z_{mC_S})/n) + |mC_T|^{1/n} \exp(-2D(U^{(T)} T \| Z_{mC_T})/n) \\ &= |mC_S|^{1/n} \exp(-2d(S | R_{q,C_S})/n) + |mC_T|^{1/n} \exp(-2d(T | R_{q,C_T})/n), \end{aligned}$$

and the result follows. Equality holds in Equation (9) if $U^{(S)} S + U^{(T)} T$ is Gaussian. This, along with proportionality of covariance matrices, is also the condition for equality in Equation (8). □
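For orientation, the classical Entropy Power Inequality (7) can be checked numerically in a simple non-Gaussian case. The sketch below takes X and Y to be independent Uniform(0, 1) variables (an illustrative choice, not one made in the paper), so that X + Y has the triangular density on [0, 2], and evaluates both sides of (7) with $n = 1$.

```python
import math

def integrate(f, lo, hi, steps=200_000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def entropy(pdf, lo, hi):
    """Shannon entropy -int p log p dx, with the convention 0 log 0 = 0."""
    def integrand(x):
        p = pdf(x)
        return -p * math.log(p) if p > 0.0 else 0.0
    return integrate(integrand, lo, hi)

def uniform_pdf(x):
    """Density of X (and of Y): Uniform(0, 1)."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def triangular_pdf(x):
    """Density of X + Y for independent Uniform(0, 1) variables."""
    return max(0.0, 1.0 - abs(x - 1.0)) if 0.0 <= x <= 2.0 else 0.0

h_x = entropy(uniform_pdf, 0.0, 1.0)           # entropy of a Uniform(0, 1)
h_sum = entropy(triangular_pdf, 0.0, 2.0)      # entropy of the triangular law
lhs = math.exp(2.0 * h_sum)                    # exp(2 H(X + Y) / n), n = 1
rhs = 2.0 * math.exp(2.0 * h_x)                # exp(2 H(X)) + exp(2 H(Y))
```

Here the inequality is strict, as expected since uniform variables are not Gaussian; for Gaussians the entropy power $\exp(2H) = 2\pi e \sigma^2$ is additive and (7) holds with equality.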

There is a parallel theory for the case $q < 1$, where we define a ∘-convolution:

Definition 2.5 For fixed q satisfying $n/(n+2) < q < 1$, given two random vectors S and T with covariance matrices $C_S$ and $C_T$ respectively, define the ∘-convolution by

$$S \circ T = \Theta^{-1}_{(m-2)(C_S+C_T)}\left( \Theta_{(m-2)C_S}(S) \star \Theta_{(m-2)C_T}(T) \right)$$

with $m = 2/(1-q) - n$, where the ★-convolution is taken with respect to index $\tilde{q}$ satisfying $1/(\tilde{q}-1) = m/2 - 1$ and

$$\Theta^{-1}_D(x) = \frac{x}{\sqrt{1 - x^T D^{-1} x}}.$$

This definition satisfies an analogue of Lemma 2.3.

Lemma 2.6 For $q < 1$, if S and T are q-Renyi entropy maximisers with covariances $C_S$ and $C_T$ then $S \circ T$ is also a q-Renyi entropy maximiser, with covariance $C_S + C_T$.

Proof By Proposition 1.4.3, $\tilde{S} = \Theta_{(m-2)C_S}(S)$ maximises $\tilde{q}$-Renyi entropy with $\tilde{q} > 1$ such that $1/(\tilde{q}-1) = 1/(1-q) - n/2 - 1$.

Moreover, the covariance matrix of $\tilde{S}$ is $C_{\tilde{S}} = \frac{m-2}{m+n} C_S$. The same result holds for $\tilde{T}$ and $C_{\tilde{T}} = \frac{m-2}{m+n} C_T$. As a consequence of Lemma 2.3, $\tilde{S} \star_{\tilde{q}} \tilde{T}$ is a $\tilde{q}$-Renyi distribution with covariance $\tilde{C} = C_{\tilde{S}} + C_{\tilde{T}}$.

Since by Proposition 1.4.3, $\Theta_{(m-2)C}(R_{q,C}) \sim R_{\tilde{q},\tilde{C}}$, where $\tilde{C} = (m-2)C/(m+n)$, taking inverse maps, $\Theta^{-1}_{(m-2)C}(R_{\tilde{q},\tilde{C}}) \sim R_{q,C}$. Here $C = \tilde{C}(m+n)/(m-2) = (C_{\tilde{S}} + C_{\tilde{T}})(m+n)/(m-2) = C_S + C_T$, as required. □



3 q-heat equation and q-Fisher information

In this section, we show that the Renyi maximising distributions satisfy a version of the de Bruijn identity. That is, we can define a Fisher information quantity, and show in Equation (11) that it is the derivative of entropy. First, we compute the exact constants in a result of Compte and Jou [4].

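Theorem 3.1 For a fixed $\mu$, write $f_\tau$ for the density of a $R_{q,\tau^\mu C}$ random variable. If $\mu = 2/(2 + n(q-1))$ then $f_\tau$ satisfies a heat equation of the form

$$K_q \frac{\partial f_\tau}{\partial \tau} = \sum_{k,l} C_{kl} \frac{\partial^2}{\partial x_k \partial x_l} f_\tau^q$$

with

$$K_q = A_q^{q-1}\, \frac{2q(2 + n(q-1))}{2q + n(q-1)},$$

where $A_q$ is the normalization constant of $g_{q,C}$.

Proof By Equation (2), we know that for a general choice of $\mu$:

$$f_\tau(x) = \frac{A_q}{\tau^{n\mu/2}} \left(1 - \frac{(q-1)\beta\, x^T C^{-1} x}{\tau^\mu}\right)_+^{\frac{1}{q-1}}, \qquad \text{where } \beta = \frac{1}{2q - n(1-q)}.$$

First note that, on the interior of the support of $f_\tau$,

$$\frac{\partial f_\tau}{\partial \tau}(x) = \frac{\mu}{\tau} f_\tau(x) \left( \frac{1}{q-1}\left(1 - \frac{(q-1)\beta\, x^T C^{-1} x}{\tau^\mu}\right)^{-1} - \frac{1}{q-1} - \frac{n}{2} \right). \qquad (10)$$

Further, for any k, writing $\Lambda = C^{-1}$:

$$\frac{\partial}{\partial x_k} f_\tau^q(x) = \frac{A_q^q}{\tau^{n\mu q/2}} \left(1 - \frac{(q-1)\beta\, x^T C^{-1} x}{\tau^\mu}\right)^{\frac{1}{q-1}} \left( \frac{-2q\beta (\Lambda x)_k}{\tau^\mu} \right).$$

Hence, for any k, l:

$$\frac{\partial^2}{\partial x_k \partial x_l} f_\tau^q(x) = \frac{A_q^q}{\tau^{n\mu q/2}} \left(1 - \frac{(q-1)\beta\, x^T C^{-1} x}{\tau^\mu}\right)^{\frac{1}{q-1}} \left( \frac{-2q\beta \Lambda_{kl}}{\tau^\mu} + \frac{4q\beta^2 (\Lambda x)_k (\Lambda x)_l}{\tau^{2\mu}} \left(1 - \frac{(q-1)\beta\, x^T C^{-1} x}{\tau^\mu}\right)^{-1} \right).$$

Overall, using $\sum_{k,l} C_{kl} \Lambda_{kl} = n$ and $\sum_{k,l} C_{kl} (\Lambda x)_k (\Lambda x)_l = x^T C^{-1} x$, we deduce that

$$\sum_{k,l} C_{kl} \frac{\partial^2}{\partial x_k \partial x_l} f_\tau^q(x) = \frac{4q\beta A_q^{q-1}}{\tau^{\mu + n\mu(q-1)/2}}\, f_\tau(x) \left( \frac{1}{q-1}\left(1 - \frac{(q-1)\beta\, x^T C^{-1} x}{\tau^\mu}\right)^{-1} - \frac{1}{q-1} - \frac{n}{2} \right),$$

so that equating this with Equation (10) we obtain:

$$K_q = \frac{4q\beta A_q^{q-1}}{\mu}\, \tau^{1 - \mu - n\mu(q-1)/2}.$$

Now, we want this to not be a function of $\tau$, so take $\mu = 2/(2 + n(q-1))$, and substitute for $\beta$ to obtain

$$K_q = A_q^{q-1}\, \frac{2q(2 + n(q-1))}{2q + n(q-1)},$$

as claimed. □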
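The q-heat equation can be checked by finite differences in a simple case. The sketch below takes $n = 1$, $q = 2$, $C = 1$ and assumes the equation is normalised as $K_q\, \partial f_\tau/\partial\tau = C\, \partial^2 (f_\tau^q)/\partial x^2$ with $K_q = A_q^{q-1}\, 2q(2 + n(q-1))/(2q + n(q-1))$ and $\mu = 2/(2 + n(q-1))$; the step size and evaluation points are arbitrary.

```python
import math

Q, C, N_DIM = 2.0, 1.0, 1
BETA = 1.0 / (2.0 * Q - N_DIM * (1.0 - Q))
MU = 2.0 / (2.0 + N_DIM * (Q - 1.0))

def norm_const(c):
    """A_q for q > 1 in one dimension with variance c."""
    a = Q / (Q - 1.0)
    return (math.gamma(a + 0.5) / math.gamma(a)
            * math.sqrt(BETA * (Q - 1.0) / (math.pi * c)))

def f(x, tau):
    """Density of R_{q, tau^mu C} evaluated at x."""
    c = tau ** MU * C
    base = 1.0 - (Q - 1.0) * BETA * x * x / c
    return norm_const(c) * base ** (1.0 / (Q - 1.0)) if base > 0.0 else 0.0

# Constant K_q, built from the normalization constant at tau = 1.
K_Q = (norm_const(C) ** (Q - 1.0) * 2.0 * Q * (2.0 + N_DIM * (Q - 1.0))
       / (2.0 * Q + N_DIM * (Q - 1.0)))

def residual(x, tau, h=1e-4):
    """K_q * df/dtau - C * d^2(f^q)/dx^2, both by central differences."""
    d_tau = (f(x, tau + h) - f(x, tau - h)) / (2.0 * h)
    fq = lambda y: f(y, tau) ** Q
    d_xx = (fq(x + h) - 2.0 * fq(x) + fq(x - h)) / (h * h)
    return K_Q * d_tau - C * d_xx

residuals = [residual(x, tau) for x, tau in [(0.0, 1.0), (0.4, 1.7), (-1.2, 0.6)]]
```

All evaluation points lie inside the support of $f_\tau$, where the density is smooth; the residuals should vanish up to finite-difference error, including at values of $\tau$ other than 1, which reflects the fact that $K_q$ does not depend on $\tau$ for this choice of $\mu$.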



Note that the value of the exponent $\mu$ coincides with the one given by Compte and Jou [4]. Further, $\lim_{q \to 1} A_q^{q-1} = 1$, so that $\lim_{q \to 1} K_q = 2$, as we would expect from the de Bruijn identity given in Lemma 2.2 of Johnson and Suhov.



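We now evaluate the derivative of the Renyi entropy, extending the de Bruijn identity. Using the heat equation of Theorem 3.1 and integration by parts,

$$\begin{aligned} \frac{\partial}{\partial \tau} H_q(f_\tau) &= \frac{q}{1-q}\, \frac{\int f_\tau(x)^{q-1} \frac{\partial f_\tau}{\partial \tau}(x)\, dx}{\int f_\tau(x)^q\, dx} \\ &= \frac{q}{(1-q) K_q \int f_\tau(x)^q dx} \sum_{k,l} C_{kl} \int f_\tau(x)^{q-1} \frac{\partial^2}{\partial x_k \partial x_l} f_\tau^q(x)\, dx \\ &= K_q^{-1} q^2\, \mathrm{tr}(C J_q(f_\tau)), \qquad (11) \end{aligned}$$

where we make the following definitions: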
where we make the following definitions: 

Definition 3.2 Given probability density p, define the q-score function

$$\rho_q(x) = \frac{\nabla p(x)}{p(x)^{2-q}}$$

and the q-Fisher information matrix to be

$$J_q(p) = \frac{\int p(x) \rho_q(x) \rho_q(x)^T dx}{\int p(x)^q\, dx}.$$

Note that the numerator is the case $p = 2$, $\lambda = q$ of the $(p, \lambda)$ Fisher information introduced in Equation (7) of fTBj. We establish a multi-dimensional Cramer-Rao inequality:

Proposition 3.3 For the Fisher information $J_q$ defined above, given a random variable with density p and covariance C then

$$J_q(p) - \frac{\int p(x)^q\, dx}{q^2}\, C^{-1}$$

is positive semidefinite, with equality if and only if $p = g_{q,C}$ everywhere.

Proof The key is a Stein-like identity, as usual found using integration by parts, since (writing $\Lambda = C^{-1}$)

$$\int p(x) (\rho_q(x))_l (\Lambda x)_k\, dx = \int \frac{\partial p}{\partial x_l}(x)\, p^{q-1}(x) (\Lambda x)_k\, dx = \frac{1}{q} \int \frac{\partial}{\partial x_l}\left(p^q\right)(x) (\Lambda x)_k\, dx = -\frac{\Lambda_{kl}}{q} \int p^q(x)\, dx.$$

This means that for any real c, the positive semidefinite matrix

$$\int p(x) (\rho_q(x) + c\Lambda x)(\rho_q(x) + c\Lambda x)^T dx = \int p(x) \rho_q(x) \rho_q^T(x)\, dx - \frac{2c}{q} \Lambda \int p^q(x)\, dx + c^2 \Lambda.$$

So we choose $c = \left(\int p^q(x)\, dx\right)/q$, and the result follows. Note that equality holds if and only if $p = g_{q,C}$ everywhere, since the Renyi maximiser has score function $\rho_q(x) = A_q^{q-1}(-2\beta)\Lambda x$, and $\int g_{q,C}^q(x)\, dx/q = \int g_{q,C}(x) A_q^{q-1}(1 - \beta(q-1) x^T \Lambda x)\, dx/q = A_q^{q-1}(1 - \beta(q-1)n)/q = A_q^{q-1}(2\beta)$. □
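The Cramer-Rao bound can be illustrated numerically in one dimension. The sketch below takes $q = 2$ and unit variance, and assumes the bound reads $J_q(p) \ge \int p^q\, dx / (q^2 C)$, which is what the choice $c = (\int p^q)/q$ in the proof yields; it compares a Gaussian density (strict inequality expected) with the maximiser $g_{2,1}(x) = A(1 - x^2/5)$ (equality expected). The function names are illustrative.

```python
import math

Q = 2.0

def integrate(f, lo, hi, steps=100_000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def q_fisher(p, dp, lo, hi):
    """Return (J_q(p), int p^q), with J_q = int p^(2q-3) (p')^2 dx / int p^q dx."""
    num = integrate(lambda x: p(x) ** (2.0 * Q - 3.0) * dp(x) ** 2, lo, hi)
    den = integrate(lambda x: p(x) ** Q, lo, hi)
    return num / den, den

# Standard Gaussian density and its derivative.
gauss = lambda x: math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
d_gauss = lambda x: -x * gauss(x)

# Renyi maximiser g_{2,1}(x) = A (1 - x^2 / 5) on |x| < sqrt(5), variance 1.
A = 3.0 / (4.0 * math.sqrt(5.0))
renyi = lambda x: A * (1.0 - x * x / 5.0)
d_renyi = lambda x: -2.0 * A * x / 5.0
R = math.sqrt(5.0)

j_gauss, pq_gauss = q_fisher(gauss, d_gauss, -12.0, 12.0)
j_renyi, pq_renyi = q_fisher(renyi, d_renyi, -R, R)

# Both densities have variance 1, so C^{-1} = 1 in the bound.
gap_gauss = j_gauss - pq_gauss / Q ** 2
gap_renyi = j_renyi - pq_renyi / Q ** 2
```

The gap for the Gaussian is strictly positive, while the gap for the Renyi maximiser vanishes up to quadrature error, in line with the equality case of Proposition 3.3.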

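Now, we can give the extensivity property for Fisher information defined in this way:

Lemma 3.4 For a compound system of independent random vectors X and Y, for $q > 1/2$ the q-Fisher information satisfies:

$$J_q(X, Y) = \begin{pmatrix} \alpha_q(Y) J_q(X) & 0 \\ 0 & \alpha_q(X) J_q(Y) \end{pmatrix},$$

where constant $\alpha_q(X) = \left(\int p_X^{2q-1}(x)\, dx\right) / \left(\int p_X(x)^q\, dx\right)$ and $\alpha_q(Y)$ similarly.

Proof We write $p_{X,Y}(x, y) = p_X(x) p_Y(y)$, so that (omitting the arguments for clarity), we can express

$$\nabla p_{X,Y} = (p_Y \nabla p_X,\; p_X \nabla p_Y).$$

Then

$$\int p_{X,Y}^{2q-3}\, \nabla p_{X,Y} \nabla p_{X,Y}^T = \begin{pmatrix} \int p_Y^{2q-1} \int p_X^{2q-3} \nabla p_X \nabla p_X^T & 0 \\ 0 & \int p_X^{2q-1} \int p_Y^{2q-3} \nabla p_Y \nabla p_Y^T \end{pmatrix},$$

since for $q > 1/2$, the off-diagonal term

$$\int p_X^{2q-2} \nabla p_X \int p_Y^{2q-2} \nabla p_Y^T = \frac{1}{(2q-1)^2} \int \nabla\left(p_X^{2q-1}\right) \int \nabla\left(p_Y^{2q-1}\right)^T = 0,$$

since this is a perfect derivative, and since $p_X(x) \to 0$ as $x \to \infty$. The result follows since

$$\int p_{X,Y}^q = \int p_X^q \int p_Y^q.$$

□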



A Proofs 

A.1 Maximum entropy property

In this section we give a proof of Proposition 1.3 which shows that $g_{q,C}$ are the Renyi entropy maximisers. The proof uses Lemma 1 of Lutwak, Yang and Zhang, which extends the classical Gibbs inequality, and is equivalent to Lemma A.2 below.

Definition A.1 For $q \neq 1$, given n-dimensional probability densities f and g, define the relative q-Renyi entropy distance from f to g to be

$$D_q(f \| g) = \frac{1}{1-q} \log\left( \int g^{q-1}(x) f(x)\, dx \right) + \frac{1-q}{q} H_q(g) - \frac{1}{q} H_q(f).$$

For $q = 1$, we write $D_1(f \| g) = \int f(x) \log(f(x)/g(x))\, dx$ for the standard relative entropy. We justify this as an extension by continuity; as $q \to 1$, as in (1), $D_q(f \| g) \to -\int f(x) \log g(x)\, dx - H_1(f) = D_1(f \| g)$.

Lemma A.2 For any $q > 0$, and for any probability densities f and g, the relative entropy $D_q(f \| g) \ge 0$, with equality if and only if $f = g$ almost everywhere.

Proof The case $q = 1$ is well-known. For $q \neq 1$, as in Lutwak, Yang and Zhang, the result is a direct application of Holder's inequality to $\exp D_q(f \| g)$. Although that paper strictly speaking only considers the 1-dimensional case, the general case is precisely the same. □
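Lemma A.2 can be illustrated numerically for simple densities. The sketch below evaluates $D_q(f \| g)$ of Definition A.1 for one-dimensional Gaussians by a midpoint rule; the choice of Gaussians, the integration range and the helper names are illustrative assumptions. For $q < 1$ the pair is chosen so that $\int g^{q-1} f$ converges.

```python
import math

def integrate(f, lo, hi, steps=40_000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def gauss(var):
    """Density of N(0, var) as a function of x."""
    return lambda x: math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def renyi_entropy(p, q, lo, hi):
    """H_q(p) = log(int p^q dx) / (1 - q), for q != 1."""
    return math.log(integrate(lambda x: p(x) ** q, lo, hi)) / (1.0 - q)

def relative_renyi(f, g, q, lo=-30.0, hi=30.0):
    """D_q(f || g) of Definition A.1, for q != 1."""
    cross = integrate(lambda x: g(x) ** (q - 1.0) * f(x), lo, hi)
    return (math.log(cross) / (1.0 - q)
            + (1.0 - q) / q * renyi_entropy(g, q, lo, hi)
            - renyi_entropy(f, q, lo, hi) / q)

narrow, wide = gauss(1.0), gauss(4.0)
d_values = {
    "q=2, f narrow": relative_renyi(narrow, wide, 2.0),
    "q=2, f wide": relative_renyi(wide, narrow, 2.0),
    "q=0.5, f narrow": relative_renyi(narrow, wide, 0.5),
}
d_self = relative_renyi(narrow, narrow, 2.0)   # should vanish
```

All three divergences between distinct densities should be strictly positive, while `d_self` should vanish up to quadrature error.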



As with the Shannon maximisers, we use this Gibbs inequality Lemma A.2 to show that the densities of Definition 1.2 really do maximise the Renyi entropy.

Proof of Proposition 1.3 Since f and $g_{q,C}$ have the same covariance matrix,

$$\int (x^T C^{-1} x) f(x)\, dx = \int (x^T C^{-1} x)\, g_{q,C}(x)\, dx.$$

This means that for $q \neq 1$

$$\begin{aligned} \int g_{q,C}^{q-1}(x) f(x)\, dx &= \int A_q^{q-1} \left(1 - (q-1)\beta\, x^T C^{-1} x\right) f(x)\, dx \\ &= \int A_q^{q-1} \left(1 - (q-1)\beta\, x^T C^{-1} x\right) g_{q,C}(x)\, dx \\ &= \int g_{q,C}^q(x)\, dx. \qquad (12) \end{aligned}$$

For $q = 1$, the equivalent of the orthogonality property Equation (12) is the well-known fact that

$$\int f(x) \log g_{1,C}(x)\, dx = \int g_{1,C}(x) \log g_{1,C}(x)\, dx.$$

Using Equation (12) we simply evaluate

$$\begin{aligned} D_q(f \| g_{q,C}) &= \frac{1}{1-q} \log\left( \int g_{q,C}^{q-1}(x) f(x)\, dx \right) + \frac{1-q}{q} H_q(g_{q,C}) - \frac{1}{q} H_q(f) \\ &= \frac{1}{q}\left( H_q(g_{q,C}) - H_q(f) \right), \end{aligned}$$

and so the result follows by Lemma A.2. □



Note that this is an alternative proof to that given by Costa, Hero and Vignat [5], who introduced a non-symmetric directed divergence measure

$$D_q(f \| g) = \mathrm{sign}(q-1) \int \left( \frac{f^q(x)}{q} + \frac{q-1}{q}\, g^q(x) - f(x) g^{q-1}(x) \right) dx.$$

The approach of [5] is similar to that used by Cover and Thomas [6, p. 234] in the Gaussian case. The general theory of directed divergence measures is discussed by Csiszar [7] and by Ali and Silvey [1].

The paper ^Hj gives more general results concerning the maximum entropy 
property, in a more geometric context. 



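A.2 Stochastic Representation

Proof of Proposition 1.4

1. Using the density of $\chi_m$, and since $\beta(q-1) = 1/m$ in Equation (2), the density of $R_{q,C} U$ can be expressed as

$$g(y) = \frac{2^{1-m/2}}{\Gamma(m/2)} \int_0^\infty x^{m-1} x^{-n} A_q \left(1 - \frac{y^T C^{-1} y}{m x^2}\right)_+^{\frac{1}{q-1}} \exp\left(-\frac{x^2}{2}\right) dx = \frac{2^{1-m/2}}{\Gamma(m/2)}\, A_q \exp\left(-\frac{y^T C^{-1} y}{2m}\right) K.$$

Here, since $m - n - 2 = 2/(q-1)$, taking $u^2 = x^2 - y^T C^{-1} y/m$, so $u\, du = x\, dx$:

$$K = \int_{\sqrt{y^T C^{-1} y/m}}^\infty x^{m-n-1} \left(1 - \frac{y^T C^{-1} y}{m x^2}\right)^{\frac{1}{q-1}} \exp\left(-\frac{x^2}{2} + \frac{y^T C^{-1} y}{2m}\right) dx = \int_0^\infty u^{\frac{2}{q-1}} \exp\left(-\frac{u^2}{2}\right) u\, du = 2^{\frac{1}{q-1}}\, \Gamma\left(\frac{q}{q-1}\right),$$

and the result follows, since the constant

$$\frac{2^{1-m/2}}{\Gamma(m/2)}\, A_q\, 2^{\frac{1}{q-1}}\, \Gamma\left(\frac{q}{q-1}\right) = \frac{1}{(2\pi m)^{n/2} |C|^{1/2}},$$

since $1 - m/2 + 1/(q-1) = -n/2$ and $\beta(q-1) = 1/m$.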
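2. In the same way, the density of $Z_{(m-2)C}/U$ can be expressed as

$$\begin{aligned} g(y) &= \frac{2^{1-m/2}}{\Gamma(m/2)} \int_0^\infty \frac{x^{m-1} x^n}{\sqrt{(2\pi(m-2))^n |C|}} \exp\left(-\frac{x^2\, y^T C^{-1} y}{2(m-2)}\right) \exp\left(-\frac{x^2}{2}\right) dx \\ &= \frac{2^{1-m/2}}{\Gamma(m/2)} \frac{1}{\sqrt{(2\pi(m-2))^n |C|}} \int_0^\infty x^{m+n-1} \exp\left(-\frac{d\, x^2}{2}\right) dx \\ &= \frac{\Gamma((n+m)/2)}{\Gamma(m/2)} \sqrt{\frac{(\beta(1-q))^n}{\pi^n |C|}}\; d^{-(m+n)/2} \\ &= \frac{\Gamma(1/(1-q))}{\Gamma(1/(1-q) - n/2)} \sqrt{\frac{(\beta(1-q))^n}{\pi^n |C|}} \left(1 + \beta(1-q)\, y^T C^{-1} y\right)^{-\frac{1}{1-q}}, \end{aligned}$$

writing $d = 1 + (y^T C^{-1} y)/(m-2)$, and using the facts that $(m+n)/2 = 1/(1-q)$ and $1/(m-2) = \beta(1-q)$, the result follows.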

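3. For this choice of parameters, with $D = C(m-2)$, $X = R_{q,C}$ has density $A_q (1 + x^T D^{-1} x)^{1/(q-1)}$. If $Y = \Theta_D(X)$, we can calculate the Jacobian $|\partial X|/|\partial Y| = (1 - Y^T D^{-1} Y)^{-1-n/2}$. Then, the standard change-of-variables relation gives that, since $1 - Y^T D^{-1} Y = (1 + X^T D^{-1} X)^{-1}$, the density of Y is the density of X evaluated at $\Theta_D^{-1}(y)$, multiplied by this Jacobian. Thus, in particular, taking $X \sim R_{q,C}$ and $D = C(m-2)$, we know that

$$g_Y(y) = (1 - y^T D^{-1} y)^{-1-\frac{n}{2}}\, A_q (1 - y^T D^{-1} y)^{-\frac{1}{q-1}} = A_q (1 - y^T D^{-1} y)^{\frac{1}{p-1}}.$$

Since $p > 1$, we know that Y has covariance $D \beta_p (p-1) = D (2p/(p-1) + n)^{-1} = C(m-2)/(m+n)$.

Further

$$A_q = \frac{(\beta_q(1-q))^{n/2}\, \Gamma\left(\frac{1}{1-q}\right)}{\Gamma\left(\frac{1}{1-q} - \frac{n}{2}\right) \pi^{n/2} |C|^{1/2}} = \frac{(\beta_p(p-1))^{n/2}\, \Gamma\left(\frac{p}{p-1} + \frac{n}{2}\right)}{\Gamma\left(\frac{p}{p-1}\right) \pi^{n/2} \left|\frac{m-2}{m+n} C\right|^{1/2}} = A_p,$$

as required. □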

Note that an alternate, stochastic proof of Equation (4) can be deduced from the polar factorization property of Student-r vectors (see [2] for a detailed study): if X is orthogonally invariant and $X = rU$ where U is uniformly distributed on the sphere, then $r = \|X\|$ and $U = X/\|X\|$ are independent. Since $R_{q,C}$ is the marginal of a vector U uniformly distributed on the sphere, we deduce that

$$R_{q,C} \sim \frac{\sqrt{m}\, C^{1/2} Z}{\sqrt{Z^T Z + \chi^2_{m-n}}},$$

where Z is a standard Gaussian vector, and where the random variable $\sqrt{Z^T Z + \chi^2_{m-n}}$ is chi distributed with m degrees of freedom and independent of $R_{q,C}$. Thus, multiplying $R_{q,C}$ by an independent chi-distributed random variable with m degrees of freedom yields a Gaussian vector with covariance matrix mC, which is exactly Equation (4).



A.3 Projection results

To prove the Entropy Power Inequality, Theorem 2.4, we prove a technical result, Proposition A.5. This relies on two well-known results, Lemma A.3 and Lemma A.4. Firstly, as a consequence of the chain rule for relative entropy (see for example Theorem 2.5.3 of Cover and Thomas [6]):

Lemma A.3 For pairs of random variables (X, Y) and (U, V),

$$D((X, Y) \| (U, V)) \ge D(X \| U).$$

Equality holds if and only if for each x, the random variables $Y | X = x$ and $V | U = x$ have the same distribution. In particular if (X, Y) and (U, V) are independent pairs, equality holds if and only if Y and V have the same distribution.

Secondly, we recall a projection identity, first stated as Corollary 4.1 of [H]: 

Lemma A.4 For random vectors X and Y, and for any invertible function $\Phi$,

$$D(\Phi(X) \| \Phi(Y)) = D(X \| Y).$$

Proposition A.5 For an n-dimensional random vector M, take $N \sim \chi_{2q/(q-1)}$ and $U \sim \chi_{2q/(q-1)+n}$, where (M, N, U) are independent:

$$D\left( \frac{M U}{\sqrt{M^T C^{-1} M + N^2}} \,\Bigg\|\, Z_C \right) \le D(M \| Z_C),$$

where equality holds if M is $N(0, C)$.

Proof By combining Lemmas A.3 and A.4, if random variables Q and S have the same distribution and (P, Q) and (R, S) each form independent pairs then

$$D(PQ \| RS) \le D((PQ, Q) \| (RS, S)) = D((P, Q) \| (R, S)) = D(P \| R). \qquad (13)$$

Now, we define $Y \sim \chi_{2q/(q-1)}$ and $V \sim \chi_{2q/(q-1)+n}$, both independent of $Z_C$, so that U and V have the same distribution, as do N and Y. The LHS of the proposition becomes:

$$\begin{aligned} D\left( \frac{M U}{\sqrt{M^T C^{-1} M + N^2}} \,\Bigg\|\, \frac{Z_C V}{\sqrt{Z_C^T C^{-1} Z_C + Y^2}} \right) &\le D\left( \frac{M}{\sqrt{M^T C^{-1} M + N^2}} \,\Bigg\|\, \frac{Z_C}{\sqrt{Z_C^T C^{-1} Z_C + Y^2}} \right) \qquad (14) \\ &= D(\Theta_C(M/N) \| \Theta_C(Z_C/Y)) \\ &= D(M/N \| Z_C/Y) \qquad (15) \\ &\le D(M \| Z_C), \qquad (16) \end{aligned}$$

and the result follows. Here Equation (14) follows by Equation (13), Equation (15) follows by Lemma A.4, and Equation (16) again follows by Equation (13). □